Be it known that I, Gavin Stuart Peter Miller, a citizen of the
United Kingdom, residing at 750 Sylvan Avenue, Apartment No. 4, City of
Mountain View, County of Santa Clara, and State of California, and I, Eric
Michael Hoffert, a citizen of the United States of America, residing at 45 1A
Sanchez Street, City of San Francisco, County of San Francisco, and State of
California, have invented certain new and useful improvements in

OBJECT SELECTION USING HIT TEST TRACKS 



of which the following is the Specification: 
Specification



OBJECT SELECTION USING HIT TEST TRACKS



Field of the Invention

The present invention relates generally to methods for selecting 
objects from a moving image sequence of digitized or synthesized images, 
and more particularly, to a technique for storing auxiliary data in an item 
buffer, along with a video track, so as to precisely identify objects which can 
be selected from within each frame of the video track. 

Brief Description of Prior Art

Object selection methods allow a user to select an individual object 
from among a group of objects in an image. One approach to object 
selection centers around determining which line segment on a two- 
dimensional screen image has been selected by a user. Typically, these line 
segments are connected to form a polygonal region, but they may also not be 
connected at all. One method, called "cursor picking", which is described
by J. D. Foley and A. Van Dam in "Fundamentals of Interactive Computer
Graphics", Addison-Wesley Publishing Company, 1984, pp. 200-204,
creates bounded extents, which can be checked using simple equations.
Such a scheme, for example, would allow a user to select and modify the 
characteristics of a particular triangle on a screen, even though there may be 
many other objects, such as circles, trapezoids, and arbitrary polygonal 
regions, also visible on the screen. Another method for object selection is to 
have an object name associated with every object in a scene. To make an 
object an active selection, the user simply types in the object name that they 
want to select. This method has no geometric correspondence. 

Another technique typically utilized in interactive systems, such as
Apple Computer Incorporated's HyperCard™ program, permits the user to
identify a rectangular bounding region on the screen with a particular
object, such as a button or field. The HyperCard program looks to see where
the cursor location is when a selection is made and, at that time, searches
for the object (such as a button or field) that has a bounding rectangle at that
location. If no bounding rectangle encloses the cursor location, no object is
selected. Conversely, if there is a bounding rectangle which encloses the
object, the corresponding object is selected. All of the above techniques do
not allow for accurate object selection of arbitrarily complex boundaries and
can be difficult to use when attempting to identify object boundaries
precisely.

Item buffers are generally used to speed up image synthesis 
algorithms, such as ray tracing or radiosity. They may also be used to 
identify single object three-dimensional surface areas for usage with 
interactive painting and lighting systems which manipulate two-dimensional 
images. When computing radiosity form factors, a hemi-cube algorithm is 
typically used to speed up the calculation. In this algorithm, five faces of a 
cube are rendered as item buffers which contain object tags. By counting 
the number of tagged pixels in the face images, the form factor is computed 
for a particular polygon when seen from the vertex of another polygon. A 
description of such a system is presented by Michael F. Cohen and Donald P.
Greenberg in "The Hemi-Cube: A Radiosity Solution for Complex
Environments", Computer Graphics, #19, Vol. 3, July 1985, pp. 31-40.

Ray tracing may be accelerated by scan-converting an "object tag"
image into an item buffer. Then, for each pixel, the ray from the camera
corresponding to that pixel is assumed to intersect with the object whose
tag is in that pixel. By using an item buffer, the algorithm avoids performing
any primary ray-object intersection tests. In this way, ray tracing is made
more computationally efficient. A description of such a system is presented
by Hank Weghorst, Gary Hooper, and Donald P. Greenberg, "Improved
Computational Methods for Ray Tracing", ACM Transactions on Graphics,
Vol. 3, No. 1, January 1984, pp. 52-69.
In "Direct WYSIWYG Painting and Texturing on 3D Shapes," by Pat
Hanrahan and Paul Haeberli, Computer Graphics, Vol. 24, No. 4, Aug. 1990,
pp. 215-223, a single three-dimensional object is rendered into an "id
buffer" which stores the surface u-v values for the visible surface in that
pixel. When painting onto the image, the surface position and surface
normal vectors are determined by examining the object id buffer and then
the result is used to shade the pixel as the texture maps are modified. This
method allows a user to paint on an image in two dimensions and allows
modification of the object geometry or lighting in three-dimensional space.
The resultant modification is computed in three-dimensional space and
then calculated as two-dimensional screen pixels, which are selectively
written into the visible screen buffer.

Brief Summary of the Invention

A preferred embodiment of the present invention comprises a method
for labeling the pixels within a selected visual area of at least one image
frame containing that visual area from a sequence of image frames stored in
memory and operative to be displayed on an interactive display so that a user
can subsequently select the selected visual area on a pixel accurate, frame
accurate basis. To label the selected visual area within an image frame, the
scene within that image frame is segmented to identify the selected visual
area, each pixel within that selected visual area is then labeled with an area
identifier which is unique to that selected visual area, and the pixels
containing the area identifiers are mapped into an item buffer. The item
buffer is then compressed and stored within a labeled portion of memory
linked with the stored frame image from which the item buffer was derived.
When a user subsequently selects a pixel within any frame image of the
sequence of frame images, the pixel is decompressed within the labeled
portion of memory corresponding to the pixel in the selected frame image
to determine the area identifier for the selected pixel. This area identifier
is then used for a number of purposes, such as to identify an area within the
frame image corresponding to the selected pixel, or to cause some action
related to the selected pixel to be performed.

Brief Description of the Drawing

FIG. 1 is a block diagram illustrating a computer for use in conjunction
with the preferred embodiment of the present invention;

FIG. 2a illustrates a single frame of a video track;

FIG. 2b illustrates a single frame of a hit test track corresponding to
the video track of FIG. 2a;

FIG. 3a illustrates a set of video tracks and sound tracks;

FIG. 3b illustrates the same multi-track data as FIG. 3a, but includes a
hit test track;

FIG. 4a illustrates the required contents of the user data section of a
hit test track in accordance with the preferred embodiment of the present
invention;

FIG. 4b illustrates the optional contents of the user data section of the
hit test track of FIG. 4a;

FIG. 5 is a flow chart illustrating the interactive playback of a movie
sequence utilizing hit test tracks in accordance with the preferred
embodiment of the present invention; and

FIG. 6 is a flow chart illustrating the creation of hit test tracks for
multi-track movies in accordance with the preferred embodiment of the
present invention.

Detailed Description of the Preferred Embodiment

The personal computer is becoming a more effective tool for
presenting multimedia works every day. Many of the techniques for
presenting and using multimedia information in such computers are carried
out in software, although hardware products could also be developed, albeit
at much greater costs, for carrying out the same functions. With respect to
the preferred embodiment of the present invention, hardware could also be
developed which would implement the present invention, but software
techniques operating in conjunction with the computer system 10 of Figure
1 are preferably utilized herein to most effectively implement the present
invention.

The computer 10, such as an Apple Macintosh computer [illegible], is
comprised of a central processing unit 12, an input/output [illegible]
powerful enough to [illegible] rate of [illegible] used with acceptable
[illegible] memory, such as a hard disk storage device, a CD ROM,
[illegible] network. Even with highly [illegible] auxiliary storage would
still be required for [illegible] and hit test [illegible] include some
type of mass storage [illegible] types of fast access memory could also be
utilized. Pointing device 18 could [illegible]

[illegible] works [illegible] typically formed from a series of
[illegible] of visual information sequentially strung together for
playback [illegible] storage device as a [illegible]. Figure 2a illustrates
a single frame [illegible] utilized herein, "video frame" means any analog
image frame or any digitized frame captured with a scanner or camera or
created using a paint program or renderer.

Figure 2b illustrates a single frame 34 of an image called an "item
buffer" which is stored as a compressed frame in a hit test track
corresponding to the video frame 30 of Figure 2a. The frames 34
corresponding to the hit test track, unlike frame 30 of the video track,
would not be visible to a user on the display 20. Rather, the hit test track, as
will be further explained below, is an auxiliary track of data which
corresponds to the video track and which identifies (maps) the location of
objects, or user defined areas, within the video track on a per pixel, per
frame basis. Although Figure 2b illustrates each of the numbered objects 36
in frame 34 corresponding to an identically shaped object 32 in the frame
30, objects 36 in the hit test track could be created which correspond to
any abstract user selected area in the frame 30, whether visible or not. For
example, if frame 30 illustrated a room with some paintings, an open
doorway, and a statue, it may be desirable to associate an object 36 from the
hit test track with each of the paintings, the statue, and the abstract open
area of the doorway. Regardless of the objects or areas selected by the user,
the auxiliary hit test track of the present invention is most useful for what is
commonly called "object picking", where the user of the computer 10 can
select an object on the display 20 using pointing device 18 in any frame of a
moving image sequence, thereby causing the system to initiate an action
based on the selected object. The initiated action can be any of a large
number of different actions, such as the playback of a separate multimedia
work or the performance of a subroutine program. As will be further
illustrated below, since the hit test data corresponds to visual objects on a
per pixel, per frame basis, object selection is highly accurate.

The present invention is ideally suited for use in a computer 10
capable of operating multimedia type computer programs, such as a software
program that is designed to manipulate various forms of media represented
as a series of related temporal tracks of data (such as video, sound, etc.),
each of those tracks being operative to be offset by some fixed time from the
other tracks, a set of such tracks being herein referred to as a multi-track
movie. A representation of a small multi-track movie is illustrated in Figure
3a, which is comprised of a first set of video and sound tracks 40 and a
second set of video and sound tracks 42. In each case, the temporal video
track duration is the same as the temporal sound duration. The second set
of video and sound tracks has a shorter duration than the first set and begins
with a fixed time offset after the start of the first set. In Figure 3b, the same
set of multi-track movie data is represented, except that there is also a hit
test track 44 stored in the movie. In this case, the hit test track
corresponds to the first set of video and sound tracks, has the same duration
as the first set, contains the same number of frames as the video track of the
first set, and identifies the location of objects in the image sequences
comprising the video track of the first set.

It should be noted that the video track and the corresponding hit test
track will be, in the most general case, a sequence of moving images.
However, it is also possible to use the techniques of the present invention on
just a single image, in which case each track comprises only a single frame.
It should also be noted that the hit test track need not be compressed using
the same compression techniques as the video track and need not be stored
at precisely the same resolution as the video track. The hit test track is
preferably compressed using a lossless data or image compression technique
which need not conform to that of the video track. In addition, if the video
track happens to be highly compressed, it may make sense to use a
subsampled, or coarser grid, version of the hit test track (such as
subsampling on the order of 2:1 or 4:1). In such an event, on playback, the
nearest available object identification value in the coarse grid version of the
hit test track is used as the object identifier. Although this alternative
embodiment will not have the pixel accurate advantage of the full resolution 
hit test track, it still permits the user to select most objects in the scene at 
an acceptable level of precision. 
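
The coarse grid lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the subsampled hit test frame is assumed to be held as a list of rows, "nearest available" is approximated by taking the coarse sample whose block covers the selected pixel, and the function name `lookup_subsampled` is hypothetical rather than part of the specification.

```python
def lookup_subsampled(coarse, factor, mx, my):
    """Return the object identifier for full-resolution pixel (mx, my).

    coarse: item buffer subsampled by `factor` in each direction (rows of ints).
    Assumption: each coarse sample stands for a factor-by-factor block of pixels.
    """
    # Map the full-resolution coordinates onto the coarse grid, clamping at
    # the right and bottom edges in case the frame size is not a multiple
    # of the subsampling factor.
    cx = min(mx // factor, len(coarse[0]) - 1)
    cy = min(my // factor, len(coarse) - 1)
    return coarse[cy][cx]


# A 4x4 full-resolution item buffer subsampled 2:1 keeps every second sample.
full = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 0, 0],
    [3, 3, 0, 0],
]
coarse = [row[::2] for row in full[::2]]
```

With real imagery the object boundaries will rarely align with the coarse grid, so a lookup near an edge may return a neighboring object's identifier, which is the loss of pixel accuracy noted in the text.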
With respect to the preferred embodiment of the present invention,
any track of the multi-track movie has the option of having an associated set
of auxiliary, hit test track, information. This auxiliary information is typically
stored along with the corresponding track of the multi-track movie as a set
of tag, size and data fields that are used to facilitate manipulation of the
temporal data contained in that hit test track. Since these fields are user or
application defined, they are referred to herein as "user data". This user
data is static, meaning it doesn't change over time. The organization and
content of the user data for a hit test track, shown generally as 50, is
illustrated in Figure 4a. The hit test tag 52 is an identifier that designates
the track as a hit test track. In the presently preferred embodiment of the
present invention, the four character tag field is represented by the
characters "HIT_", wherein "_" represents a space. The hit test track is
marked with this tag field to distinguish the hit test track from video data.
Hence, when computer 10 is interpreting the track data, it will know to
only use the hit test track to identify objects which lie in the video scene.
The next field in the hit test track 50 is the size of the data field 54, which
indicates the number of bytes of information in the data field.

The remaining portions of information contained in hit test track 50
are within the data field, which is preferably comprised of video track
identifier 56, compression format 58, pixel bit depth 60 and hit test data
62. The video track identifier 56 describes the video track in a multi-track
movie to which the hit test track 50 corresponds. Utilization of a video
track identifier 56 allows the computer 10 to know which video track is
used in conjunction with the hit test track. Such information can be
important where there are a number of hit test tracks which refer to the
same video track. Compression format 58 indicates the format utilized to
compress the hit test data 62.
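
As a rough illustration of the user data layout of Figure 4a, the fields can be modeled as a simple record. The field types, the string used for the compression format, and the choice to count only the payload bytes in `data_size` are assumptions made for this sketch; the specification does not fix an encoding.

```python
from dataclasses import dataclass

HIT_TEST_TAG = "HIT "  # written "HIT_" in the text, where "_" stands for a space


@dataclass
class HitTestUserData:
    tag: str                 # hit test tag 52
    data_size: int           # size of the data field 54, in bytes
    video_track_id: int      # video track identifier 56
    compression_format: str  # compression format 58
    pixel_bit_depth: int     # pixel bit depth 60
    hit_test_data: bytes     # hit test data 62

    def is_hit_test_track_for(self, video_track_id: int) -> bool:
        """True if this user data marks a hit test track for the given video track."""
        return self.tag == HIT_TEST_TAG and self.video_track_id == video_track_id


payload = b"\x04\x01\x04\x02\x08\x00"  # placeholder bytes, not a real frame
user_data = HitTestUserData(
    tag=HIT_TEST_TAG,
    data_size=len(payload),  # assumption: counts only the payload in this sketch
    video_track_id=1,
    compression_format="rle",
    pixel_bit_depth=8,
    hit_test_data=payload,
)
```

Keeping the video track identifier next to the tag lets a player reject a hit test track that belongs to a different video track before touching the compressed data.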

As previously stated, although a number of different compression 
formats can be utilized for both the video track and the hit test data 62, the
preferred embodiment for the hit test data 62 is lossless data encoding.
There are a number of applicable methods of lossless encoding that may be
employed, including run-length encoding, quad-tree coding, or Huffman
coding, all of which are well known in the art. By indicating the
compression format, the computer 10 may readily determine how to
decompress the hit test data. Pixel bit depth 60 indicates the pixel bit
depth to which the compressed data is to be decompressed. This feature
allows for correct interpretation of the word length of the hit test data 62.
It should be noted that compact descriptions of the objects in the hit test
track 50, other than compression techniques, can be utilized. For example,
it may be desirable to store a geometric description of the objects in the hit 
test track 50. This list of geometric primitives for hit test regions would 
likewise correspond to each frame in the original video track. 
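
Run-length encoding, one of the lossless methods named above, suits item buffers well because neighboring pixels usually carry the same object identifier. The following is a minimal sketch for a single scan line; the pair layout (run length, identifier) is an assumption chosen for clarity, not a format defined by the specification.

```python
def rle_encode(row):
    """Encode a row of object identifiers as (run_length, identifier) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][1] == value:
            # Same identifier as the previous pixel: extend the current run.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Identifier changed: start a new run of length one.
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Expand (run_length, identifier) pairs back into the original row."""
    row = []
    for length, value in runs:
        row.extend([value] * length)
    return row


row = [0, 0, 1, 1, 1, 1, 2, 2, 0]
encoded = rle_encode(row)  # [(2, 0), (4, 1), (2, 2), (1, 0)]
```

Because decoding reproduces the row exactly, every pixel keeps its identifier, which is the property a lossy video codec could not guarantee.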

It should also be noted that hit test track 50 need not include all of 
the above-described portions in order to be fully operable. Rather than
include an indication of the compression format 58 or the pixel bit depth
60, there could be a default compression format utilized by computer 10
which automatically provided that information. For example, the present
invention could take advantage of the compression formats offered by a 
software program which manipulates (including compression and 
decompression) multi-track movies, whereby computer 10 would 
automatically know to handle various types of data in accordance with 
various types of compression formats. 

In addition to the portions of information described above contained
within the data field of hit test track 50, there are other portions of
information which could be included therein. Two such portions are
described below with reference now to Figure 4b. Object to string name
mapping table 64 could be utilized to associate ASCII string names with
particular objects in the corresponding video or sound track. For example,
it may be desirable to store the name "cube" in the hit test track
corresponding to every pixel of a video image containing a cube. A likely
table construct would contain a list of a series of numbers and associated
name strings, such as ((1,cube), (2,painting), (3,chair), (4,blob), etc.).
These names can then be passed on to a scripting environment for further
interpretation or usage. Object to event mapping table 66 could likewise be
utilized to associate events with particular objects. For example, it may be
desirable to initiate the event "play movie scene 3" whenever a user uses
the cursor on the display 20 under the control of the pointing device 18 to
select a pixel containing a particular object. A likely table construct would
contain a list of a series of numbers and associated event strings, such as
((1,"play movie X"), (2,"play sound Y"), (3,"go to screen 10"), (4,"play movie
Z"), etc.). These events could also then be passed on to an interpretive
scripting environment.
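
The two optional tables of Figure 4b map naturally onto key-value lookups. The sketch below uses the example entries from the text; the dictionary representation and the helper name `describe_selection` are assumptions for illustration only.

```python
# Object to string name mapping table 64 (example entries from the text).
NAME_TABLE = {1: "cube", 2: "painting", 3: "chair", 4: "blob"}

# Object to event mapping table 66 (example entries from the text).
EVENT_TABLE = {
    1: "play movie X",
    2: "play sound Y",
    3: "go to screen 10",
    4: "play movie Z",
}


def describe_selection(object_id):
    """Return the (name, event) pair for an object identifier.

    Either element is None when the identifier has no entry in that table.
    """
    return NAME_TABLE.get(object_id), EVENT_TABLE.get(object_id)
```

A scripting environment would receive the returned strings and decide how to act on them; the tables themselves carry no behavior.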

Although discussed in greater detail with reference to Figures 5 and 6,
the operation of a computer 10 running a program which utilizes hit test
tracks as part of a multi-track movie will now be briefly described. To 
determine when to access data in the hit test track, the program of 
computer 10 determines when a user has made a selection at a particular 
position on the screen of display 20 where the aforementioned cursor is 
displayed. The program then determines which frame of the video 
sequence is currently being displayed. At that point, the program 
interrogates each track of the multi-track movie to determine which track 
has the identifiers indicating it is a hit test track for the particular video 
track being displayed. Once the proper hit test track has been determined, 
the frame in the hit test track corresponding to the video frame currently 
being displayed is accessed and decompressed according to the particular 
compression format in which it is stored. 

During decompression, only the region at or surrounding the pixel of 
interest is decompressed. When the exact pixel for object selection is 
identified, its decompressed value is returned to the program as the object's
identifier. The object identifier can then be used to map into a name table
or event table if so desired. If the object identifier maps into a name table,
an ASCII string name is returned to the program. If the object identifier
maps into an event table, the "event" is returned to the system, which can
trigger the occurrence of various events, such as the playing of a sound, the
display of a sequence of video frames or a picture image on the screen of
display 20. The event to be triggered and handled by the program, as
mentioned above, is data contained in the event table. The meaning of those 
events will depend on what type of interactive environment is used on the 
program of interest. In the preferred embodiment of the present invention, 
events are to be interpreted by a high level scripting language. 
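
Decompressing only the region at the pixel of interest, as described above, is straightforward when each scan line is compressed independently. The sketch below assumes a hypothetical per-row run-length layout of (run length, identifier) pairs; with that layout only the selected row is walked, and no pixel is ever expanded.

```python
def identifier_at(rle_rows, mx, my):
    """Return the object identifier at pixel (mx, my) without expanding the frame.

    rle_rows: one list of (run_length, identifier) pairs per scan line
              (an assumed layout, not one defined by the specification).
    """
    if mx < 0:
        raise IndexError("pixel column lies outside the encoded row")
    end = 0
    for length, value in rle_rows[my]:
        end += length          # column just past the end of this run
        if mx < end:           # the selected column falls inside this run
            return value
    raise IndexError("pixel column lies outside the encoded row")


frame = [
    [(2, 0), (3, 7), (1, 0)],  # row 0 decodes to 0 0 7 7 7 0
    [(6, 0)],                  # row 1 decodes to 0 0 0 0 0 0
]
```

The cost of a lookup is proportional to the number of runs on one scan line rather than to the number of pixels in the frame.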

With reference now to Figure 5, a flow chart illustrating the interactive
playback of a movie sequence utilizing hit test tracks in accordance with the
preferred embodiment of the present invention will now be described. As
the frames of a moving image sequence from a video track are played back by
computer 10, block 70, the program tests to see if a mouse down event has
occurred in the video frame, block 72. If a mouse down event has occurred
in a video frame, versus some other portion of visible space on the screen of
display 20, the video frame X is recorded in memory 16, as is the mouse
position (mx,my) at the time of the mouse down event, block 74. If no
mouse down event has occurred, the program returns to block 70 to
continue playing the movie sequence. After storing the video frame X and
the mouse down position (mx,my), the program proceeds to search the user
data fields of all of the tracks of the multi-track movie for any track that has
the hit test track identifier or tag "HIT_", block 76.

When a track identified as a hit test track has been found, block 78, the
program reviews the user data of the hit test track to verify that the
identified hit test track refers to the current video track being displayed,
block 80. If the hit test track refers to the current video track, the program
then determines the compression format Z, unless there is a default
compression format, and the bit depth at which to decompress the data,
block 82. The next step in the process is to decompress the appropriate
frame X (corresponding to the video frame X in the sequence) of the hit test
track using the decompression method Z. Although the decompression that
occurs can be of the full video frame X, it is preferable to just decompress
the region surrounding the exact pixel location (mx,my) selected by the user
in the video frame X, block 84. Note that the object identifier value for the
selected object would be the same regardless of the pixel location within the
object selected by the user. Thus, while decompressing the entire object
would certainly produce the appropriate object identifier, decompressing
just the pixel location selected would achieve the same result. The object
identifier value of the decompressed data at pixel location (mx,my) is then
returned to the system, block 86. As previously described above, more
complex optional versions of the above process can decompress the hit test
data and use the object identifier to map into a table, which returns either
an ASCII name or an event to be triggered by the program.
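
The playback flow of Figure 5, from the track search through the return of the object identifier, can be sketched as below. The dictionary layout for a track's user data, the "VIDE" tag for a video track, the "raw" format name, and the pluggable `decompressors` mapping are all hypothetical; the bit depth handling of block 82 is omitted for brevity.

```python
def hit_test(tracks, video_track_id, frame_x, mx, my, decompressors,
             default_format="raw"):
    """Return the object identifier under (mx, my) in frame X, or None.

    tracks: list of dicts holding each track's user data (assumed layout).
    decompressors: maps a compression format name to a function
                   f(compressed_frame, mx, my) -> object identifier.
    """
    for track in tracks:                                         # block 76
        if track.get("tag") != "HIT ":                           # block 78
            continue
        if track.get("video_track_id") != video_track_id:        # block 80
            continue
        fmt = track.get("compression_format", default_format)    # block 82
        decompress = decompressors[fmt]
        return decompress(track["frames"][frame_x], mx, my)      # blocks 84, 86
    return None


tracks = [
    {"tag": "VIDE", "frames": []},  # hypothetical tag for a video track
    {"tag": "HIT ", "video_track_id": 1, "compression_format": "raw",
     "frames": [[[0, 5], [5, 5]]]},  # one 2x2 frame stored uncompressed
]
# "raw" is a stand-in format whose decompressor simply indexes the frame.
decompressors = {"raw": lambda frame, mx, my: frame[my][mx]}
```

Passing the decompressor in by format name mirrors the role of compression format 58: the caller never needs to know how a given hit test track was encoded.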

With reference now to FIG. 6, a flow chart illustrating the creation of
hit test tracks for multi-track movies in accordance with the preferred 
embodiment of the present invention will be described. In block 90, a 
newly digitized video frame or a rendered animation frame from a sequence 
of moving images is input. The program then looks to see if the input frame 
is from rendered animation or digitized video, block 92. If the input frame 
is from rendered animation, an item buffer is generated for the frame when 
rendering the images from the sequence, block 94. As previously discussed, 
this item buffer, which is later incorporated into the hit test track, is used 
as a map of all of the objects in the scene by labeling each pixel which is 
contained within the area defining that object with an item number or 
object identifier. Note that pixels within the same object or area of interest 
would contain the same object identifier. 
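
The labeling step can be illustrated with a small sketch that fills an item buffer from a list of areas. The membership test `contains(x, y)`, the use of 0 to mean "no object", and the rule that later areas overwrite earlier ones are assumptions for this example; a renderer would normally write the identifiers as a by-product of drawing each object.

```python
def build_item_buffer(width, height, areas):
    """Label each pixel with the identifier of the area that contains it.

    areas: list of (object_id, contains) pairs, where contains(x, y) -> bool
           is a hypothetical membership test for that object's area.
           Later entries overwrite earlier ones where areas overlap.
    """
    buffer = [[0] * width for _ in range(height)]  # 0 = no object (assumed)
    for object_id, contains in areas:
        for y in range(height):
            for x in range(width):
                if contains(x, y):
                    buffer[y][x] = object_id
    return buffer


areas = [
    (1, lambda x, y: x < 2),                              # left strip
    (2, lambda x, y: (x - 3) ** 2 + (y - 1) ** 2 <= 1),   # small disc
]
item_buffer = build_item_buffer(5, 3, areas)
```

Every pixel of the left strip carries identifier 1 and every pixel of the disc carries identifier 2, so a selection anywhere inside an area yields the same identifier.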

If the input frame is from digitized video, the objects in the video 
scene depicted in the video frame are segmented, using pattern recognition 
techniques or through manual object tracking, to generate an item buffer for 
that scene, block 96. Although pattern recognition techniques are less labor 
intensive than manual object tracking, the effectiveness of pattern 
recognition, and therefore object identification, can vary significantly 
depending on the subject matter being recognized. In addition, manual 
object tracking has the added advantage of allowing the user to specify 
"invisible" areas of interest in addition to visible objects of interest. 
Regardless of the type of input data, once an item buffer is created, each 
item buffer is then compressed using lossless compression, block 98. The 
program then looks, block 100, to see if the video track corresponding to
that item buffer is compressed at a ratio greater than a predetermined
threshold, such as 10:1. As previously discussed, if the video track happens
to be highly compressed, it may make sense to use a subsampled, or coarser
grid, version of the item buffer, such as subsampling on the order of 2:1 or
4:1. Hence, in block 102, a subsampled lower resolution version of the 
compressed item buffer is used in place of a normal resolution item buffer, 
as is utilized in block 104. Again, it should be noted that when a subsampled 
lower resolution item buffer is utilized, on playback, the nearest available 
object identification value in the coarse grid version of the hit test track is 
used as the object identifier. 
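
The decision of blocks 100 through 104 can be sketched as a single function. As a simplification, the sketch subsamples the uncompressed item buffer rather than a compressed one, and the parameter names, the default threshold of 10, and the default factor of 2 are illustrative choices taken from the examples in the text.

```python
def choose_item_buffer(item_buffer, video_compression_ratio,
                       threshold=10.0, factor=2):
    """Return (buffer, factor_used) following blocks 100 through 104.

    item_buffer: rows of object identifiers at full resolution.
    video_compression_ratio: e.g. 20.0 for a video track compressed 20:1.
    """
    if video_compression_ratio > threshold:                       # block 100
        # Keep every `factor`-th sample in each direction.
        coarse = [row[::factor] for row in item_buffer[::factor]]
        return coarse, factor                                     # block 102
    return item_buffer, 1                                         # block 104


full = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 0, 0],
    [3, 3, 0, 0],
]
```

Returning the factor alongside the buffer lets the playback side scale the selected pixel coordinates before looking up the identifier.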

Regardless of the type of item buffer used, in block 106, the images in
each item buffer are then stored as a compressed frame in the hit test data
portion of a hit test track. The video frame corresponding to that hit test
track is then stored in the video track, block 108. This process continues
for each frame of the sequence of images until all frames of the sequence
have been processed, block 110, at which point the remainder of the user
data, such as the hit test tag 52, the size of the data field 54, the video track
identifier 56, the lossless compression format 58, and the pixel bit depth
60, are stored in the hit test track 50, block 112.

It should be noted that the present invention has many applications 
related to video display and manipulation technologies, such as the 
multimedia applications described above, but also in other areas, such as 
video games, where pixel accurate, frame accurate object selection would be 
desirable. Hence, although the present invention has been described with 
reference to Figures 1 through 6 and with emphasis on a particular 
embodiment, it should be understood that the figures are for illustration 
only and should not be taken as limitations upon the invention. It is 
contemplated that many changes and modifications may be made by one of 
ordinary skill in the art to the elements, process and arrangement of steps 
of the process of the invention without departing from the spirit and scope 
of the invention as disclosed above. 
We Claim:

1. A method for labeling and subsequently identifying selected areas
within images from a sequence of one or more images operative to be
displayed by a display of a computer having a memory for storing the images,
comprising the steps of:

(a) identifying an area to be labeled within an image from said
sequence of images;

(b) labeling every pixel within said area with an area identifier
which is unique to said area;

(c) storing each labeled pixel in a labeled portion of memory linked
[illegible];

(d) repeating steps (a) through (c) for each selected area within
each image from said sequence of images;

(e) interrogating said memory in response to a user's selection of a
pixel location within a selected area from a selected image displayed on said
display to locate a labeled portion corresponding to said selected image;

(f) evaluating said labeled portion corresponding to said selected
image to locate an area identifier corresponding to said pixel location; and

(g) identifying said area identifier to said computer as an indication
of said selected area.

2. The method for labeling and subsequently identifying as recited in 
claim 1, and further comprising the step of (h) comparing said area 
identifier corresponding to said pixel location to a table containing
additional information about said selected area. 

3. The method for labeling and subsequently identifying as recited in 
claim 2, wherein said additional information includes a name for said 
selected area. 
4. The method for labeling and subsequently identifying as recited in
claim 2, wherein said additional information includes an action to be
performed by said computer which corresponds to said selected area.

5. The method for labeling and subsequently identifying as recited in 
claim 1, wherein said images are video frames from a sequence of video
frames, and wherein step (a) includes the step of segmenting each area to
be labeled from within each video frame using a pattern recognizer. 

6. The method for labeling and subsequently identifying as recited in
claim 1, wherein said images are video frames from a sequence of video
frames, and wherein step (a) includes the step of manually segmenting each
area to be labeled from within each video frame. 

7. The method for labeling and subsequently identifying as recited in 
claim 1, wherein said images are rendered frames from a sequence of one or
more rendered frames, and wherein step (a) includes the step of rendering
said rendered frames to identify each area to be labeled from within each
rendered frame.

8. The method for labeling and subsequently identifying as recited in
claim 1, wherein said selected areas correspond to visual objects and
abstract areas within said images.

9. The method for labeling and subsequently identifying as recited in 
claim 1, wherein step (b) includes the steps of:

mapping said area into an item buffer corresponding to said image;
and

assigning said area identifier to each pixel within said area to form 
labeled pixels within said item buffer corresponding to pixel locations 
within said image. 
10. The method for labeling and subsequently identifying as recited in
claim 9, wherein step (c) includes the steps of: 

compressing said item buffer; 

storing said compressed item buffer in said labeled portion of 
memory; and 

storing an image identifier with said compressed item buffer in said 
labeled portion of memory to link said labeled portion of memory to said 
image. 
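
By way of illustration only, the labeling and storage steps recited in claims 9 and 10 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the use of zlib as the compression format, 8-bit area identifiers, a row-major buffer layout, and the names label_area, store_item_buffer, and labeled_memory are all hypothetical.

```python
import zlib


def label_area(item_buffer, width, pixels, area_id):
    """Write area_id into the item buffer at every (x, y) in pixels."""
    for x, y in pixels:
        item_buffer[y * width + x] = area_id


def store_item_buffer(labeled_memory, image_id, item_buffer):
    """Compress the buffer (zlib is an assumed format) and key it by image identifier."""
    labeled_memory[image_id] = zlib.compress(bytes(item_buffer))


width, height = 4, 3
item_buffer = bytearray(width * height)  # 0 is assumed to mean "no selectable area"
label_area(item_buffer, width, [(1, 1), (2, 1)], area_id=7)

labeled_memory = {}  # stands in for the labeled portion of memory
store_item_buffer(labeled_memory, image_id=0, item_buffer=item_buffer)
restored = zlib.decompress(labeled_memory[0])
```

Keying the compressed buffer by the image identifier is what links the labeled portion of memory back to its image in this sketch.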

11. The method for labeling and subsequently identifying as recited in
claim 10, wherein said images are compressed for storage in said memory,
wherein said item buffer is compressed using a lossless compression format.

12. The method for labeling and subsequently identifying as recited in
claim 11, wherein a normal resolution item buffer is utilized when said
images are compressed at a ratio equal to or less than a predetermined
threshold and a subsampled low resolution item buffer is utilized when said
images are compressed at a ratio greater than said predetermined
threshold. 
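
The resolution choice recited in claim 12 could be expressed, purely as an illustrative sketch, along the following lines. The function name choose_item_buffer, the subsampling step of 2, and the use of simple decimation (keeping every second sample) are assumptions; the claim itself does not fix a threshold value or a subsampling method.

```python
def choose_item_buffer(item_buffer, width, height, image_ratio, threshold, step=2):
    """Return (buffer, width, height) at normal or subsampled resolution."""
    if image_ratio <= threshold:
        # Images are lightly compressed: keep the normal resolution item buffer.
        return item_buffer, width, height
    # Images are heavily compressed: keep every step-th pixel in each direction.
    rows = range(0, height, step)
    cols = range(0, width, step)
    small = [item_buffer[y * width + x] for y in rows for x in cols]
    return small, len(cols), len(rows)
```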

13. The method for labeling and subsequently identifying as recited in 
claim 10, wherein step (c) further includes the step of storing a labeled
portion of memory identifier with said image identifier to distinguish said 
labeled portion of memory to said computer from other portions of said 
memory. 

14. The method for labeling and subsequently identifying as recited in 
claim 10, wherein step (c) further includes the step of storing a 
compression format indicator with said image identifier for indicating a 
compression format for said compressed item buffer to said computer.

15. The method for labeling and subsequently identifying as recited in
claim 10, wherein step (c) further includes the step of storing a pixel bit 
depth indicator with said image identifier for indicating a bit depth for each 
labeled pixel to said computer when decompressing said labeled pixel. 

16. The method for labeling and subsequently identifying as recited in
claim 10, wherein step (c) further includes the step of storing an area to
name mapping table with said image identifier containing a name for said
selected area which is reported to said computer when said selected area is
identified to said computer in step (g).

17. The method for labeling and subsequently identifying as recited in
claim 10, wherein step (c) further includes the step of storing an area to
event mapping table with said image identifier containing an event
corresponding to said selected area which is reported to and performed by
said computer when said selected area is identified to said computer in step
(g).

18. A method for aiding a user's selection of areas within images
represented by a series of related temporal tracks of image data stored in a
memory of a computer, the computer being operative to selectively display
said images on a display of the computer by accessing the temporal tracks of
image data from the memory, each of the temporal tracks of image data
being operative to be offset in the memory by a fixed time from other
temporal tracks of image data stored in the memory, comprising the steps
of:

(a) identifying an area within an image which could be selected by
said user;

(b) labeling each pixel within said area with an area identifier which
is unique to said area;

(c) storing each labeled pixel in a labeled track in said memory
corresponding to said image;

(d) repeating steps (a) through (c) for each area within each image
which could be selected by said user;

(e) searching through said memory in response to said user's
selection of a pixel location within a selected area of a selected image to
locate a labeled track corresponding to said selected image;

(f) searching said labeled track for a labeled pixel corresponding to 
said pixel location; and 

(g) retrieving said area identifier corresponding to said labeled pixel 
from said memory to indicate said selected area to said computer. 
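
Steps (e) through (g) above might be sketched as follows, for illustration only. The sketch assumes that each labeled track is a zlib-compressed, row-major buffer of 8-bit area identifiers held in a dictionary keyed by image identifier; the names hit_test and labeled_tracks are hypothetical and do not come from the specification.

```python
import zlib


def hit_test(labeled_tracks, image_id, x, y, width):
    """Return the area identifier under pixel (x, y), or None if no track exists."""
    compressed = labeled_tracks.get(image_id)  # step (e): locate the labeled track
    if compressed is None:
        return None
    item_buffer = zlib.decompress(compressed)  # assumed compression format
    return item_buffer[y * width + x]          # steps (f) and (g): find pixel, read id


width = 4
frame = bytearray(width * 3)
frame[1 * width + 2] = 7  # the pixel at (2, 1) belongs to area 7
labeled_tracks = {0: zlib.compress(bytes(frame))}
```

A call such as hit_test(labeled_tracks, 0, 2, 1, width) then yields the identifier of the area under the selected pixel.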

19. The method for aiding a user's selection of areas within images as 
recited in claim 18, and further comprising the step of: 

(h) comparing said area identifier corresponding to said pixel 
location to a table containing additional information about said selected area. 

20. The method for aiding a user's selection of areas within images as 
recited in claim 19, wherein said additional information includes a name for 
said selected area to be communicated to said user by said computer. 

21. The method for aiding a user's selection of areas within images as 
recited in claim 19, wherein said additional information includes an action 
identifier which corresponds to said selected area, said action identifier 
being operative to be utilized by said computer to perform an action related 
to said selected area. 
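
The table comparison of claims 19 through 21 could look roughly like the sketch below. It assumes the additional information is held in two dictionaries keyed by area identifier, one mapping to names and one mapping to callable actions; describe_selection, name_table, and action_table are hypothetical names introduced only for this example.

```python
def describe_selection(area_id, name_table, action_table):
    """Look up the name for an area and run its action, if one is registered."""
    name = name_table.get(area_id, "unknown")
    action = action_table.get(area_id)
    outcome = action() if action is not None else None
    return name, outcome


name_table = {7: "door"}                  # area identifier to name
action_table = {7: lambda: "open door"}   # area identifier to action
```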

22. The method for aiding a user's selection of areas within images as 
recited in claim 18, wherein said areas include visual objects and abstract 
areas within said images. 

23. The method for aiding a user's selection of areas within images as 
recited in claim 18, wherein step (b) includes the steps of: 

mapping said area into an item buffer corresponding to said image; and

assigning said area identifier to each pixel within said area to form 
labeled pixels within said item buffer corresponding to pixel locations 
within said image. 

24. The method for aiding a user's selection of areas within images as 
recited in claim 23, wherein step (c) includes the steps of: 

compressing said item buffer; 

storing said compressed item buffer in said labeled track; and 
storing an image identifier with said compressed item buffer in said 
labeled track. 

25. The method for aiding a user's selection of areas within images as 
recited in claim 24, wherein said image data are compressed 
representations of said images, wherein said item buffer is a compressed 
representation of said area, said item buffer operative to be compressed by 
said computer using a lossless compression format. 

26. The method for aiding a user's selection of areas within images as
recited in claim 25, wherein a normal resolution item buffer is utilized when
said images are compressed at a ratio equal to or less than a predetermined 
threshold and a subsampled low resolution item buffer is utilized when said 
images are compressed at a ratio greater than said predetermined 
threshold. 

27. The method for aiding a user's selection of areas within images as 
recited in claim 24, wherein step (c) further includes the step of storing a 
labeled track identifier in said labeled track to distinguish said labeled track 
to said computer from said temporal tracks of image data. 

28. The method for aiding a user's selection of areas within images as 
recited in claim 24, wherein said area identifier corresponding to said 
labeled pixel is retrieved from said memory in step (g) by decompressing at
least one of said labeled pixels from said item buffer, and wherein step (c)
further includes the step of storing a compression format in said labeled
track in such a manner so as to indicate a format to be utilized by said
computer when decompressing said labeled pixels.

29. The method for aiding a user's selection of areas within images as
recited in claim 28, wherein step (c) further includes the step of storing a
pixel bit depth indicator in said labeled track in such a manner so as to
indicate a bit depth at which to decompress said labeled pixels.

30. The method for aiding a user's selection of areas within images as
recited in claim 24, wherein step (c) further includes the step of storing an
area to name mapping table in said labeled track, said table including a name
for said selected area which is reported to said computer when said area 
identifier is retrieved from said computer. 

31. The method for aiding a user's selection of areas within images as
recited in claim 24, wherein step (c) further includes the step of storing an
area to event mapping table in said labeled track, said table including an
event corresponding to said selected area which is reported to and
performed by said computer when said area identifier is retrieved from said 
computer. 

32. A hit test track for identifying and aiding in a user's selection of items
within a multimedia work operative to be performed by a computer, the
multimedia work being represented by a series of temporal tracks of data
stored in a memory used by the computer, each of the temporal tracks of
data being operative to be offset from other temporal tracks of data by some
fixed time, the hit test track comprising:

a data section operative to be utilized by said computer in identifying
said items, said data section including an item identifier for each of said
items;

a data section size indicator to be utilized by said computer to denote a
quantity of said memory occupied by said data section;

a temporal track identifier to be utilized by said computer to indicate a
temporal track which corresponds to said data section; and

a hit test identifier to be utilized by said computer to distinguish said 
hit test track from said temporal tracks. 
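
The four elements recited in claim 32 might be gathered into a single record as in the sketch below. This is an illustration under assumptions rather than a definitive layout: the field names, the field types, and the tag value b"HIT_" are hypothetical, and the size indicator is derived from the data section here instead of being stored separately.

```python
from dataclasses import dataclass


@dataclass
class HitTestTrack:
    hit_test_identifier: bytes  # distinguishes this track from temporal tracks (value assumed)
    temporal_track_id: int      # temporal track to which the data section corresponds
    data: bytes                 # data section holding the item identifiers

    @property
    def data_size(self) -> int:
        """Data section size indicator: bytes occupied by the data section."""
        return len(self.data)


track = HitTestTrack(
    hit_test_identifier=b"HIT_",
    temporal_track_id=1,
    data=bytes([0, 7, 7, 0]),
)
```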

33. The hit test track as recited in claim 32, wherein said data section
includes an item buffer corresponding to a group of said items, and wherein
said item identifiers for said group of items are stored within said item
buffer. 

34. The hit test track as recited in claim 33, wherein said item buffer is
operative to be compressed by said computer using a lossless compression
format prior to storage in said memory. 

35. The hit test track as recited in claim 34, wherein said temporal tracks
of data are stored in said memory in a compressed format, wherein said
item buffer is a normal resolution item buffer when said temporal tracks of
data are compressed at a ratio equal to or less than a predetermined
threshold, and wherein said item buffer is a subsampled low resolution item
buffer when said temporal tracks of data are compressed at a ratio greater
than said predetermined threshold.

36. The hit test track as recited in claim 33, wherein said item buffer is
operative to be compressed by said computer in a compression format prior 
to storage in said memory, and wherein said hit test track further comprises 
a compression format indicator to be utilized by said computer to determine 
said compression format.

37. The hit test track as recited in claim 33, wherein each of said items is
an object operative to be displayed on a display of said computer when said
multimedia work is performed by said computer, wherein said item buffer
includes a group of pixels corresponding to each of said items, wherein said
item buffer is operative to be compressed by said computer in a
compression format prior to storage in said memory, and wherein said hit
test track further comprises a bit depth indicator to be utilized by said
computer to indicate a compression bit depth for said group of pixels to said
computer when said item buffer is retrieved from said memory. 

38. The hit test track as recited in claim 32, wherein said hit test track
further comprises an item to name mapping table which includes a name for 
each of said items. 

39. The hit test track as recited in claim 32, wherein said hit test track
further includes an item to event mapping table, each event corresponding 
to a selected item, each event being operative to be performed by said 
computer after said user selects an item.

A method for labeling the pixels within a selected visual area of at least
one image frame containing that visual area from a sequence of image frames
stored in memory and operative to be displayed on an interactive display so
that a user may subsequently select the selected visual area on a pixel
accurate, frame accurate basis. To label the selected visual area within an
image frame, the scene within that image frame is segmented to identify the
selected visual area, each pixel within that selected visual area is then
labeled with an area identifier which is unique to that selected visual area,
and the pixels containing the area identifiers are mapped into an item
buffer. The item buffer is then compressed and stored within a labeled
portion of memory linked with the stored frame image from which the item
buffer was derived. When a user subsequently selects a pixel within any
frame image of the sequence of frame images, the pixel is decompressed
within the labeled portion of memory corresponding to the pixel in the
selected frame image to determine the area identifier for the selected pixel.
This area identifier is then used for a number of purposes, such as to
identify an area within the frame image corresponding to the selected pixel,
or to cause some action related to the selected pixel to be performed.